Introduction

subdiagnosis <- readr::read_tsv(
  file.path("..", "..", "..", "data", "current", params$scpca_project_id, "single_cell_metadata.tsv"),
  show_col_types = FALSE
) |>
  dplyr::filter(scpca_sample_id == params$sample_id) |>
  dplyr::pull(subdiagnosis) |>
  unique()

The aim is to explore the clustering and label transfers for the sample SCPCS000186. This sample has a(n) Anaplastic subdiagnosis.

In order to explore the clustering results, we look into some marker genes, pathways enrichment and label transfer.

This approach would provide us a rapid idea of the quality of the clustering:

  • in the case each cluster do have a set of specific marker genes, we could expect each of the cluster to have at least a different phenotype,
  • if marker genes are mostly shared between clusters, we might have over-clustered,
  • if no marker genes are found, the quality of cells in the clusters might be impaired (high mitochondrial content? ribosomal content?)

Packages

library(Seurat)
library(sctransform)
library(SCpubr) # for plotting
library(tidyverse)
library(patchwork)
library(msigdbr)
library(enrichplot)
library(clusterProfiler)
library(org.Hs.eg.db)
library(ggalluvial)

options(future.globals.maxSize = 8912896000000)
set.seed(params$seed)

Base directories

# The base path for the OpenScPCA repository, found by its (hidden) .git directory
repository_base <- rprojroot::find_root(rprojroot::is_git_root)

# The current data directory, found within the repository base directory
data_dir <- file.path(repository_base, "data", "current", params$scpca_project_id)

# The path to this module
module_base <- file.path(repository_base, "analyses", "cell-type-wilms-tumor-06")

Input files

In this notebook, we are working with the Wilms tumor sample defined in SCPCS000186 from the Wilms tumor dataset SCPCP000006. We work with the pre-processed and labeled Seurat object that is the output of 02b_label-transfer_fetal_kidney_reference_Stewart.Rmd saved in the results directory.

result_dir <- file.path(module_base, "results", params$sample_id)

To explore the clustering results, we look into some marker genes reported here:

CellType_metadata <- read_csv(file.path(module_base, "marker-sets", "CellType_metadata.csv"))

DT::datatable(CellType_metadata,
  caption = ("CellType_metadata"),
  extensions = "Buttons",
  options = list(
    dom = "Bfrtip",
    buttons = c("csv", "excel")
  )
)

Output file

Reports will be saved in the notebook directory. The pre-processed and annotated Seurat object per samples are saved in the result folder.

Functions

Here we defined function that will be used multiple time all along the notebook.

Visualize seurat clusters and metadata

For a Seurat object objectand a metadata metadata, the function visualize_metadata will plot FeaturePlot and BarPlot

  • object is the Seurat object

  • metadata the gene or quantitative value to be plotted

  • group.by is the metadata used for grouping the violin plots

visualize_metadata <- function(object, meta, group.by) {
  if (is.numeric(object@meta.data[, meta])) {
    d <- SCpubr::do_FeaturePlot(object,
      features = meta,
      pt.size = 0.2,
      legend.width = 0.5,
      legend.length = 5,
      legend.position = "right"
    ) + ggtitle(meta)
    b <- SCpubr::do_ViolinPlot(srat,
      features = meta,
      ncol = 1,
      group.by = group.by,
      legend.position = "none"
    )

    return(d + b + plot_layout(ncol = 2, widths = c(2, 4)))
  } else {
    d <- SCpubr::do_DimPlot(object, reduction = "umap", group.by = group.by, label = TRUE, repel = TRUE) + ggtitle(paste0(meta, " - umap")) + theme(text = element_text(size = 18))
    b <- SCpubr::do_BarPlot(
      sample = object,
      group.by = meta,
      split.by = group.by,
      position = "fill",
      font.size = 10,
      legend.ncol = 3
    ) +
      ggtitle("% cells") +
      xlab(print(group.by)) +
      theme(text = element_text(size = 18))
    return(d + b + plot_layout(ncol = 2, widths = c(2, 4)))
  }
}

Visualize seurat clusters and markers genes

For a Seurat object objectand a features features, the function visualize_feature will plot FeaturePlot and ViolinPlot

  • object is the Seurat object

  • features the gene or quantitative value to be plotted

  • group.by is the metadata used for grouping the violin plots

visualize_feature <- function(object, features, group.by = "seurat_clusters") {
  feature_symbol <- AnnotationDbi::select(org.Hs.eg.db,
    keys = features,
    columns = "SYMBOL",
    keytype = "ENSEMBL"
  )

  d <- SCpubr::do_FeaturePlot(object,
    features = feature_symbol$ENSEMBL,
    pt.size = 0.2,
    legend.width = 0.5,
    legend.length = 5,
    legend.position = "right"
  ) + ggtitle(feature_symbol$SYMBOL)
  b <- SCpubr::do_ViolinPlot(srat,
    features = feature_symbol$ENSEMBL,
    ncol = 1,
    group.by = group.by,
    legend.position = "none",
    assay = "SCT"
  ) + ylab(feature_symbol$SYMBOL)

  return(d + b + plot_layout(ncol = 2, widths = c(2, 4)))
}

Calculate ModuleScore from a MSigDB dataset

For a Seurat object object, the function MSigDB_score will calculate a score using AddModuleScore() function for a MSigDB gene set gs_name in the category category

  • object is the Seurat object to calculate the score. Score will be added in the metadata of this object.

  • category is a MSigDB collection (https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp). values in c("H", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8")

  • gs_name is the name of a MSigDB gene set, e.g. "HALLMARK_P53_PATHWAY"

  • name is the name of the module

  • nbins is the number of bins for AddModuleScore to use. Its default value is 24, which matches Seurat’s default value.

MSigDB_score <- function(object, category, gs_name, name, nbin = 24) {
  set <- msigdbr(species = "human", category = category)
  set_list <- set %>%
    dplyr::filter(gs_name == gs_name) %>%
    dplyr::distinct(gs_name, gene_symbol, human_ensembl_gene) %>%
    as.data.frame()
  set_list <- list(set_list$human_ensembl_gene)
  suppressWarnings({
    object <- AddModuleScore(object, features = set_list, name = name, nbin = nbin)
  })
  return(object)
}

Enrichment analysis

Enrichment_plotaim to perform enrichment of the marker genes for each of the Seurat clusters and summarize the results in a dotplot.

  • category is the MSigDB category or collection to be used values in c("H", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8")

  • signatures is a list of marker genes per cluster

  • backgroundis the universe used for enrichment

Enrichment_plot <- function(category, signatures, background) {
  ## define genesets
  tryCatch(
    expr = {
      gene_set <- msigdbr(species = "human", category = category)
      msigdbr_set <- gene_set %>%
        dplyr::distinct(gs_name, ensembl_gene) %>%
        as.data.frame()
      cclust <- compareCluster(
        geneCluster = signatures,
        fun = enricher,
        TERM2GENE = msigdbr_set,
        universe = background
      )
      d <- dotplot(cclust, showCategory = 15) + scale_y_discrete(labels = function(x) str_wrap(x, width = 40))
      return(d)
    },
    error = function(e) {
      print("No enrichment")
    }
  )
}

do_Table_Heatmap

do_Table_Heatmap shows heatmap of counts of cells for combinations of two metadata variables

  • data seurat object
  • first_group is the name of the first metadata to group the cells
  • last_group is the name of the second metadata to group the cells
do_Table_Heatmap <- function(data, first_group, last_group) {
  df <- data@meta.data %>%
    mutate_if(sapply(data@meta.data, is.character), as.factor) %>%
    group_by(!!sym(first_group), !!sym(last_group)) %>%
    summarise(Nb = log(n()))

  p <- ggplot(df, aes(x = !!sym(first_group), y = !!sym(last_group), fill = Nb)) +
    geom_tile() +
    scale_fill_viridis_c() +
    theme_bw() +
    theme(text = element_text(size = 20)) +
    theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1)) +
    guides(fill = guide_legend(title = " Number of \n cells (log)"))

  return(p)
}

Analysis

Load the pre-processed Seurat object

# open the processed rds object
srat <- readRDS(file.path(result_dir, paste0("02b-fetal_kidney_label-transfer_", params$sample_id, ".Rds")))
DefaultAssay(srat) <- "SCT"

# Add number of bins to use with Seurat::AddModuleScore().
# One sample in this project fails with the default number of bins, so we need to override it
if (params$testing) {
  seurat_nbins <- 5
} else if (params$sample_id == "SCPCS000197") {
  seurat_nbins <- 23 # default is 24, so this is very close
} else {
  seurat_nbins <- 24
}

Visualize seurat clusters

We expect up to 5 set of clusters:

  • blastema cancer cells
  • epithelium cancer and/or normal cells
  • stroma cancer and/or normal cells
  • immune cells
  • endothelial cells
d1 <- SCpubr::do_DimPlot(srat, reduction = "pca", group.by = "seurat_clusters", label = TRUE) + ggtitle("Seurat Cluster - pca")
d2 <- SCpubr::do_DimPlot(srat, reduction = "umap", group.by = "seurat_clusters", label = TRUE) + ggtitle("Seurat Cluster - umap")
v1 <- SCpubr::do_ViolinPlot(srat, features = c("subsets_mito_percent"), ncol = 1, group.by = "seurat_clusters", legend.position = "none")
v2 <- SCpubr::do_ViolinPlot(srat, features = c("detected"), ncol = 1, group.by = "seurat_clusters", legend.position = "none")
v3 <- SCpubr::do_ViolinPlot(srat, features = c("sum"), ncol = 1, group.by = "seurat_clusters", legend.position = "none")

d1 + d2 + (v1 + v2 + v3 + plot_layout(ncol = 1)) + plot_layout(ncol = 3, widths = c(2, 2, 4))

Cell cycle information

s.genes <- srat@assays$RNA@meta.data$gene_ids[srat@assays$RNA@meta.data$gene_symbol %in% cc.genes$s.genes]
g2m.genes <- srat@assays$RNA@meta.data$gene_ids[srat@assays$RNA@meta.data$gene_symbol %in% cc.genes$g2m.genes]
srat <- CellCycleScoring(srat, s.features = s.genes, g2m.features = g2m.genes, set.ident = FALSE, nbin = seurat_nbins)
visualize_metadata(srat, meta = "Phase", group.by = "seurat_clusters")
## [1] "seurat_clusters"

visualize_metadata(srat, meta = "S.Score", group.by = "seurat_clusters")

visualize_metadata(srat, meta = "G2M.Score", group.by = "seurat_clusters")

Look at specific genes

Here, we open the table of marker genes marker-sets/CellType_metadata.csv. Note: we do not expect to have a clear and nice pattern of expression for all of the following markers in every tumor. This is just ti get a few idea.

for (feature in CellType_metadata$ENSEMBL_ID[CellType_metadata$ENSEMBL_ID %in% rownames(srat@assays$SCT)]) {
  print(visualize_feature(srat, features = feature, group.by = "seurat_clusters"))
}

Look at specific pathways

TP53 pathway

here we will calculate a TP53 score using AddModuleScore and the genes of the HALLMARK_P53_PATHWAY gene set.

srat <- MSigDB_score(object = srat, category = "H", gs_name = "HALLMARK_P53_PATHWAY", name = "TP53_score", nbin = seurat_nbins)
visualize_metadata(srat, meta = "TP53_score1", group.by = "seurat_clusters")

DNA repair pathway

here we will calculate a DNA_repair score using AddModuleScore and the genes of the HALLMARK_DNA_REPAIR gene set.

srat <- MSigDB_score(object = srat, category = "H", gs_name = "HALLMARK_DNA_REPAIR", name = "DNA_repair_score", nbin = seurat_nbins)
visualize_metadata(srat, meta = "DNA_repair_score1", group.by = "seurat_clusters")

Note: Chemo-treated samples should have higher DNA-damage scores.

DROSHA target genes

srat <- MSigDB_score(object = srat, category = "C3", gs_name = "DROSHA_TARGET_GENES", name = "DROSHA_score", nbin = seurat_nbins)
visualize_metadata(srat, meta = "DROSHA_score1", group.by = "seurat_clusters")

DICER1 target genes

srat <- MSigDB_score(object = srat, category = "C3", gs_name = "DICER1_TARGET_GENES", name = "DICER1_score", nbin = seurat_nbins)
visualize_metadata(srat, meta = "DICER1_score1", group.by = "seurat_clusters")

Find marker genes for each of the seurat clusters

In addition to the list of known marker genes, we used an unbiased approach to find transcripts that characterized the different clusters.

We run DElegate::FindAllMarkers2 to find markers of the different clusters and manually check if they do make sense. DElegate::FindAllMarkers2 is an improved version of Seurat::FindAllMarkers based on pseudobulk differential expression method. Please check the preprint from Hafemeister: https://www.biorxiv.org/content/10.1101/2023.03.28.534443v1 and tool described here: https://github.com/cancerbits/DElegate

Of note, we won’t use it for annotation, this is just here to get an idea!

# this chunk is not run in testing (CI) since there are generally too few clusters for the method to work
feature_conversion <- srat@assays$RNA@meta.data
de_results <- DElegate::FindAllMarkers2(srat, group_column = "seurat_clusters")
# filter the most relevant markers
s.markers <- de_results[de_results$padj < params$padj_thershold & de_results$log_fc > params$lfc_threshold & de_results$rate1 > params$rate1_threshold, ]
# add gene symbol for easiest interpretation of the result
s.markers$gene_ids <- s.markers$feature
s.markers <- left_join(s.markers, feature_conversion, by = c("gene_ids"))
stopifnot(
  "Error joining gene ids and feature names" = identical(s.markers$feature, s.markers$gene_ids)
)
DT::datatable(s.markers,
  caption = ("marker genes"),
  extensions = "Buttons",
  options = list(
    dom = "Bfrtip",
    buttons = c("csv", "excel")
  )
)
# Select top 5 genes for heatmap plotting
s.markers <- na.omit(s.markers)
s.markers %>%
  group_by(group1) %>%
  top_n(n = 5, wt = log_fc) -> top5
# subset for plotting
cells <- WhichCells(srat, downsample = 100)
ss <- subset(srat, cells = cells)
ss <- ScaleData(ss, features = top5$feature)
p1 <- SCpubr::do_DimPlot(srat, reduction = "umap", group.by = "seurat_clusters", label = TRUE, repel = TRUE) + ggtitle("Seurat Cluster - umap")
p2 <- DoHeatmap(ss, features = top5$feature, cells = cells, group.by = "seurat_clusters") + NoLegend() +
  scale_fill_gradientn(colors = c("#01665e", "#35978f", "darkslategray3", "#f7f7f7", "#fee391", "#fec44f", "#F9AD03"))
p3 <- ggplot(srat@meta.data, aes(seurat_clusters, fill = seurat_clusters)) +
  geom_bar() +
  NoLegend()
common_title <- sprintf("Unsupervised clustering %s, %d cells", srat@meta.data$orig.ident[1], ncol(srat))
show((((p1 / p3) + plot_layout(heights = c(3, 2)) | p2)) + plot_layout(widths = c(1, 2)) + plot_layout(heights = c(3, 1)) + plot_annotation(title = common_title))

DT::datatable(top5[, c(1, 9, 11, 12)],
  caption = ("top 5 marker genes"),
  extensions = "Buttons",
  options = list(
    dom = "Bfrtip",
    buttons = c("csv", "excel")
  )
)

Enrichment analysis of marker genes

Here we perform enrichment analysis of the marker genes found in the previous section for each Seurat cluster.

We defined as universe/background all the genes expressed in the dataset, meaning the rownames of the Seurat object.

We used three gene sets from MSigDB:

  • the hallmark gene sets are coherently expressed signatures derived by aggregating many MSigDB gene sets to represent well-defined biological states or processes.
  • the C3 : regulatory target gene sets based on gene target predictions for microRNA seed sequences and predicted transcription factor binding sites.
  • the C8 : cell type signature gene sets curated from cluster markers identified in single-cell sequencing studies of human tissue.

We used enricher function from clusterProfiler to perform enrichment analysis.

# define background genes = universe for enrichment
background <- rownames(srat)
# Define gene signature per cluster
signatures <- list()
for (i in unique(s.markers$group1)) {
  signatures[[paste0("cluster ", i)]] <- s.markers$feature[s.markers$group1 == i]
}

Hallmarks MSigDB gene sets

Enrichment_plot(category = "H", signatures = signatures, background = background)

EMT signature should be enriched in stroma cluster. E2F/proliferation should be enriched in blastema cluster. MYC(N), TP53 must be enriched in blastema cluster.

C8 MSigDB cell type signature gene sets

Enrichment_plot(category = "C8", signatures = signatures, background = background)

The MSigDB C8 gene set is quite relevant for kidney and nephroblastoma annotations. Epithelial (cancer and normal) cells should be enriched in mature/adult kidney pathways while blastema cancer cells will show enrichment of fetal kidney development pathway / cap mesenchyme.

Cell annotation

Introduction to the four annotation strategies

The use of the right reference is crucial for label transfer and annotation. It is recommended that the cell types in the reference is representative to the cell types to be annotated in the query.

Wilms tumors can contain up to three histologies that resemble fetal kidney: blastema, stroma, and epithelia. [1, 2]. Because of their histological similarity to fetal kidneys, Wilms tumors are thought to arise from developmental derangements in embryonic renal progenitors.

For these reasons, we expect Wilms tumor cells to map to fetal cells, especially fetal kidney cells.

Here, we investigate annotations resulting from 4 different strategies:

  • the _processed.rds objects already contain automated annotations computed by the Data Lab:

  • the label transfers from two fetal references using Azimuth, that have been performed in the previous analysis

    • Using the human fetal atlas as a reference: This is the reference developed by Cao et al. provided by Azimuth for label transfer. The reference contain cells from 15 organs including kidney from fetal samples. The label transfer have been performed in the notebook 02a_fetal_full_reference_Cao_{sample_id}.Rmd

    • Using the human fetal kidney atlas: Stewart et al. created a human fetal kidney atlas. This reference contains only fetal kidney cells and has been precisely annotated by kidney experts. The label transfer have been performed in the notebook 02b_fetal_kidney_reference_Stewart_{sample_id}.Rmd

Annotation from SingleR (no cancer or kidney specific dataset)

visualize_metadata(srat, meta = "singler_celltype_annotation", group.by = "seurat_clusters")
## [1] "seurat_clusters"

Annotation from CellAssign (no cancer or kidney specific dataset)

visualize_metadata(srat, meta = "cellassign_celltype_annotation", group.by = "seurat_clusters")
## [1] "seurat_clusters"

Annotation from the fetal reference (Cao et al)

The fetal full reference (Cao et al.) provides three levels of annotations:

  • fetal_full_predicted.organ is one of the 15 organs used for building the reference. We were nicely surprised that, for many of the samples, most of the cells from the Wilms tumor cohort mapped to fetal kidney cells.

  • fetal_full_predicted.annotation.l1 holds predicted cell type annotation

  • fetal_full_predicted.annotation.l2 holds the combination of the predicted organ and predicted cell type annotation

visualize_metadata(srat, meta = "fetal_full_predicted.organ", group.by = "seurat_clusters")
## [1] "seurat_clusters"

visualize_metadata(srat, meta = "fetal_full_predicted.annotation.l1", group.by = "seurat_clusters")
## [1] "seurat_clusters"

visualize_metadata(srat, meta = "fetal_full_predicted.annotation.l2", group.by = "seurat_clusters")
## [1] "seurat_clusters"

Annotation from the kidney reference (Stewart et al)

The fetal kidney reference (Stewart et al.) provides two levels of annotations:

  • fetal_kidney_predicted.compartment is one of the 4 main compartments composing a fetal kidney. We expect immune and endothelial cells to be healthy (non-cancerous) cells identified easily with high confidence. We expect stroma and fetal nephron compartment to contain both normal and malignant cells.

    • immune cells

    • stroma cells

    • fetal_nephron cells

    • endothelial cells

  • fetal_kidney_predicted.cell_type holds predicted cell type annotations that details further the four compartments, as for example:

    • immune subtypes: myeloid cells, lymphocytes, …

    • stroma subtypes: mesenchymal stem cells, fibroblasts, …

    • fetal_nephron subtypes: podocytes, uteric bud, mesenchymal cells (that appear to be cap mesenchyme cells) …

    • endothelial cells

visualize_metadata(srat, meta = "fetal_kidney_predicted.compartment", group.by = "seurat_clusters")
## [1] "seurat_clusters"

visualize_metadata(srat, meta = "fetal_kidney_predicted.cell_type", group.by = "seurat_clusters")
## [1] "seurat_clusters"

Compare annotations with marker genes

Here we evaluate with marker genes the identification of endothelial, immune cells and other cancer and non-cancerous sub-populations.

markers <- list(
  "malignant" = CellType_metadata$ENSEMBL_ID[which(CellType_metadata$cell_class == "malignant")],
  "immune" = CellType_metadata$ENSEMBL_ID[which(CellType_metadata$cell_class == "immune")],
  "endothelium" = CellType_metadata$ENSEMBL_ID[which(CellType_metadata$cell_class == "endothelium")],
  "non-malignant" = CellType_metadata$ENSEMBL_ID[which(CellType_metadata$cell_class == "non-malignant")]
)

p1 <- SCpubr::do_DotPlot(
  sample = srat,
  group.by = "fetal_kidney_predicted.compartment",
  features = markers,
  axis.text.x.angle = 90,
  flip = TRUE,
  plot.title = "fetal_kidney_predicted.compartment"
)

p2 <- SCpubr::do_DotPlot(
  sample = srat,
  group.by = "fetal_kidney_predicted.cell_type",
  features = markers,
  axis.text.x.angle = 90,
  flip = TRUE,
  plot.title = "fetal_kidney_predicted.cell_type"
)


p3 <- SCpubr::do_DotPlot(
  sample = srat,
  group.by = "fetal_full_predicted.annotation.l1",
  features = markers,
  axis.text.x.angle = 90,
  flip = TRUE,
  plot.title = "fetal_full_predicted.annotation.l1"
)


p4 <- SCpubr::do_DotPlot(
  sample = srat,
  group.by = "fetal_full_predicted.annotation.l2",
  features = markers,
  axis.text.x.angle = 90,
  flip = TRUE,
  plot.title = "fetal_full_predicted.annotation.l2"
)

p5 <- SCpubr::do_DotPlot(
  sample = srat,
  group.by = "seurat_clusters",
  features = markers,
  axis.text.x.angle = 90,
  flip = TRUE,
  plot.title = "seurat_clusters"
)


p6 <- SCpubr::do_DotPlot(
  sample = srat,
  group.by = "cellassign_celltype_annotation",
  features = markers,
  axis.text.x.angle = 90,
  flip = TRUE,
  plot.title = "cellassign_celltype_annotation"
)

(p1 | p2) / (p3 | p4) / (p5 | p6)

Compare annotations between them

Annotation versus Seurat clusters

Here we check whether one (or more) Seurat cluster do correspond to one label. This would allow us to be more confident in the annotation of immune or endothelial cells for example.

p1 <- do_Table_Heatmap(srat,
  last_group = "fetal_kidney_predicted.compartment",
  first_group = "seurat_clusters"
)


p2 <- do_Table_Heatmap(srat,
  last_group = "fetal_full_predicted.annotation.l1",
  first_group = "seurat_clusters"
)


p3 <- do_Table_Heatmap(srat,
  last_group = "singler_celltype_annotation",
  first_group = "seurat_clusters"
)


p4 <- do_Table_Heatmap(srat,
  last_group = "cellassign_celltype_annotation",
  first_group = "seurat_clusters"
)

(p1 | p2) / (p3 | p4)

Comparison of the labels from the two fetal references

first level annotation
p1 <- do_Table_Heatmap(srat,
  last_group = "fetal_kidney_predicted.compartment",
  first_group = "fetal_full_predicted.annotation.l1"
)
p2 <- do_Table_Heatmap(srat,
  last_group = "fetal_kidney_predicted.cell_type",
  first_group = "fetal_full_predicted.annotation.l1"
)
p1 | p2

second level annotation
p2 <- do_Table_Heatmap(srat,
  last_group = "fetal_kidney_predicted.cell_type",
  first_group = "fetal_full_predicted.annotation.l2"
)
p2

Comparison of the fetal reference with SingleR and CellAssign

first level annotation
p1 <- do_Table_Heatmap(srat,
  last_group = "fetal_kidney_predicted.compartment",
  first_group = "cellassign_celltype_annotation"
)
p2 <- do_Table_Heatmap(srat,
  last_group = "fetal_kidney_predicted.compartment",
  first_group = "singler_celltype_annotation"
)
p1 | p2

second level annotation
p1 <- do_Table_Heatmap(srat,
  last_group = "fetal_kidney_predicted.cell_type",
  first_group = "cellassign_celltype_annotation"
)

p2 <- do_Table_Heatmap(srat,
  last_group = "fetal_kidney_predicted.cell_type",
  first_group = "singler_celltype_annotation"
)

p1 / p2

Session Info

# record the versions of the packages used in this analysis and other environment information
sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS 15.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices datasets  utils     methods  
## [8] base     
## 
## other attached packages:
##  [1] ggalluvial_0.12.5      org.Hs.eg.db_3.19.1    AnnotationDbi_1.66.0  
##  [4] IRanges_2.38.1         S4Vectors_0.42.1       Biobase_2.64.0        
##  [7] BiocGenerics_0.50.0    clusterProfiler_4.12.6 enrichplot_1.24.4     
## [10] msigdbr_7.5.1          patchwork_1.2.0        lubridate_1.9.3       
## [13] forcats_1.0.0          stringr_1.5.1          dplyr_1.1.4           
## [16] purrr_1.0.2            readr_2.1.5            tidyr_1.3.1           
## [19] tibble_3.2.1           ggplot2_3.5.1          tidyverse_2.0.0       
## [22] SCpubr_2.0.2           sctransform_0.4.1      Seurat_5.1.0          
## [25] SeuratObject_5.0.2     sp_2.1-4              
## 
## loaded via a namespace (and not attached):
##   [1] fs_1.6.4                 matrixStats_1.3.0        spatstat.sparse_3.1-0   
##   [4] httr_1.4.7               RColorBrewer_1.1-3       tools_4.4.1             
##   [7] utf8_1.2.4               R6_2.5.1                 DT_0.33                 
##  [10] lazyeval_0.2.2           uwot_0.2.2               withr_3.0.1             
##  [13] gridExtra_2.3            progressr_0.14.0         cli_3.6.3               
##  [16] spatstat.explore_3.3-2   fastDummies_1.7.4        scatterpie_0.2.4        
##  [19] labeling_0.4.3           sass_0.4.9               spatstat.data_3.1-2     
##  [22] ggridges_0.5.6           pbapply_1.7-2            yulab.utils_0.1.7       
##  [25] gson_0.1.0               DOSE_3.30.5              R.utils_2.12.3          
##  [28] parallelly_1.38.0        limma_3.60.4             RSQLite_2.3.7           
##  [31] generics_0.1.3           gridGraphics_0.5-1       ica_1.0-3               
##  [34] spatstat.random_3.3-1    crosstalk_1.2.1          vroom_1.6.5             
##  [37] GO.db_3.19.1             Matrix_1.7-0             fansi_1.0.6             
##  [40] abind_1.4-5              R.methodsS3_1.8.2        lifecycle_1.0.4         
##  [43] edgeR_4.2.1              yaml_2.3.10              qvalue_2.36.0           
##  [46] Rtsne_0.17               grid_4.4.1               blob_1.2.4              
##  [49] promises_1.3.0           crayon_1.5.3             miniUI_0.1.1.1          
##  [52] lattice_0.22-6           cowplot_1.1.3            KEGGREST_1.44.1         
##  [55] pillar_1.9.0             knitr_1.48               fgsea_1.30.0            
##  [58] future.apply_1.11.2      codetools_0.2-20         fastmatch_1.1-4         
##  [61] leiden_0.4.3.1           glue_1.7.0               ggfun_0.1.6             
##  [64] spatstat.univar_3.0-0    data.table_1.16.0        vctrs_0.6.5             
##  [67] png_0.1-8                treeio_1.28.0            spam_2.10-0             
##  [70] gtable_0.3.5             assertthat_0.2.1         cachem_1.1.0            
##  [73] xfun_0.47                mime_0.12                tidygraph_1.3.1         
##  [76] survival_3.7-0           DElegate_1.2.1           statmod_1.5.0           
##  [79] fitdistrplus_1.2-1       ROCR_1.0-11              nlme_3.1-166            
##  [82] ggtree_3.12.0            bit64_4.0.5              RcppAnnoy_0.0.22        
##  [85] GenomeInfoDb_1.40.1      rprojroot_2.0.4          bslib_0.8.0             
##  [88] irlba_2.3.5.1            KernSmooth_2.23-24       colorspace_2.1-1        
##  [91] DBI_1.2.3                tidyselect_1.2.1         bit_4.0.5               
##  [94] compiler_4.4.1           httr2_1.0.3              plotly_4.10.4           
##  [97] shadowtext_0.1.4         scales_1.3.0             lmtest_0.9-40           
## [100] rappdirs_0.3.3           digest_0.6.37            goftest_1.2-3           
## [103] spatstat.utils_3.1-0     rmarkdown_2.28           XVector_0.44.0          
## [106] htmltools_0.5.8.1        pkgconfig_2.0.3          MatrixGenerics_1.16.0   
## [109] sparseMatrixStats_1.16.0 highr_0.11               fastmap_1.2.0           
## [112] rlang_1.1.4              htmlwidgets_1.6.4        UCSC.utils_1.0.0        
## [115] shiny_1.9.1              farver_2.1.2             jquerylib_0.1.4         
## [118] zoo_1.8-12               jsonlite_1.8.8           BiocParallel_1.38.0     
## [121] GOSemSim_2.30.2          R.oo_1.26.0              magrittr_2.0.3          
## [124] GenomeInfoDbData_1.2.12  ggplotify_0.1.2          dotCall64_1.1-1         
## [127] munsell_0.5.1            Rcpp_1.0.13              ape_5.8                 
## [130] babelgene_22.9           viridis_0.6.5            reticulate_1.38.0       
## [133] stringi_1.8.4            ggraph_2.2.1             zlibbioc_1.50.0         
## [136] MASS_7.3-61              plyr_1.8.9               parallel_4.4.1          
## [139] listenv_0.9.1            ggrepel_0.9.5            deldir_2.0-4            
## [142] Biostrings_2.72.1        graphlayouts_1.1.1       splines_4.4.1           
## [145] tensor_1.5               hms_1.1.3                locfit_1.5-9.10         
## [148] igraph_2.0.3             spatstat.geom_3.3-2      RcppHNSW_0.6.0          
## [151] reshape2_1.4.4           evaluate_0.24.0          renv_1.0.7              
## [154] BiocManager_1.30.25      tzdb_0.4.0               tweenr_2.0.3            
## [157] httpuv_1.6.15            RANN_2.6.2               polyclip_1.10-7         
## [160] future_1.34.0            scattermore_1.2          ggforce_0.4.2           
## [163] xtable_1.8-4             RSpectra_0.16-2          tidytree_0.4.6          
## [166] later_1.3.2              viridisLite_0.4.2        aplot_0.2.3             
## [169] memoise_2.0.1            cluster_2.1.6            timechange_0.3.0        
## [172] globals_0.16.3